Show code
pacman::p_load(arrow, lubridate, sf, tidyverse, spNetwork, tmap,
spatstat, raster, maptools)February 3, 2024
We will be applying appropriate spatial point patterns analysis methods learned in class to discover the geographical and spatio-temporal distribution of Grab hailing services locations in Singapore.
The R packages that we will be using in this exercise are as follows:
arrow: For reading parquet files (Grab-Posisi Dataset)
lubridate: To handle the date formatting
sf: Import, manage and process vector-based geospatial data in R.
tidyverse: a collection of packages for data science tasks
spatstat: Wide range of useful functions for point pattern analysis and derive kernel density estimation (KDE) layer.
spNetwork: provides functions to perform Spatial Point Patterns Analysis such as kernel density estimation (KDE) and K-function on network. It also can be used to build spatial matrices (‘listw’ objects like in ‘spdep’ package) to conduct any kind of traditional spatial analysis with spatial weights based on reticular distances.
tmap: Provides functions for plotting cartographic quality static point patterns maps or interactive maps by using leaflet API.
raster: reads, writes, manipulates, analyses and model of gridded spatial data (i.e. raster). In this hands-on exercise, it will be used to convert image output generate by spatstat into raster format.
maptools: Provides a set of tools for manipulating geographic data. In this take-home exercise, we mainly use it to convert Spatial objects into ppp format of spatstat.
# classInt, viridis, rgdal
The datasets that we will be using are as follow:
Using read_parquet() function from arrow package to import the grab data, then changing pingtimestamp column to datetime object
Transforming the coordinate system at the same time when we are importing the data
Reading layer `gis_osm_roads_free_1' from data source
`/Users/jacksontan/Documents/Sashimii0219/IS415-GAA/Take-home_Ex/Take-home_Ex01/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 1759836 features and 10 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 99.66041 ymin: 0.8021131 xmax: 119.2601 ymax: 7.514393
Geodetic CRS: WGS 84
Transforming the coordinate system at the same time when we are importing the data
Reading layer `MPSZ-2019' from data source
`/Users/jacksontan/Documents/Sashimii0219/IS415-GAA/Take-home_Ex/Take-home_Ex01/data/geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
Before we begin exploring the data, we will first need to perform some data pre-processing on the datasets that we have imported.
As grab won’t be able to reach offshore places, we will exclude the outer islands from this dataset. We will do this through the following steps:
We will first take a look at the unique planning areas in Singapore using unique() on the PLN_AREA_N column of mpsz2019 dataset.
[1] "MARINA EAST" "RIVER VALLEY"
[3] "SINGAPORE RIVER" "WESTERN ISLANDS"
[5] "MUSEUM" "MARINE PARADE"
[7] "SOUTHERN ISLANDS" "BUKIT MERAH"
[9] "DOWNTOWN CORE" "STRAITS VIEW"
[11] "QUEENSTOWN" "OUTRAM"
[13] "MARINA SOUTH" "ROCHOR"
[15] "KALLANG" "TANGLIN"
[17] "NEWTON" "CLEMENTI"
[19] "BEDOK" "PIONEER"
[21] "JURONG EAST" "ORCHARD"
[23] "GEYLANG" "BOON LAY"
[25] "BUKIT TIMAH" "NOVENA"
[27] "TOA PAYOH" "TUAS"
[29] "JURONG WEST" "SERANGOON"
[31] "BISHAN" "TAMPINES"
[33] "BUKIT BATOK" "HOUGANG"
[35] "CHANGI BAY" "PAYA LEBAR"
[37] "ANG MO KIO" "PASIR RIS"
[39] "BUKIT PANJANG" "TENGAH"
[41] "SELETAR" "SUNGEI KADUT"
[43] "YISHUN" "MANDAI"
[45] "PUNGGOL" "CHOA CHU KANG"
[47] "SENGKANG" "CHANGI"
[49] "CENTRAL WATER CATCHMENT" "SEMBAWANG"
[51] "WESTERN WATER CATCHMENT" "WOODLANDS"
[53] "NORTH-EASTERN ISLANDS" "SIMPANG"
[55] "LIM CHU KANG"

Note that there are 3 areas with island in their name, mainly “NORTH-EASTERN ISLANDS”, “SOUTHERN ISLANDS”, and “WESTERN ISLANDS”.
To exclude the islands, we simply have to pass a condition to exclude these islands in the subset function.
We will be using the st_is_valid() function to test for invalid geometries.
[1] 3
[1] "Ring Self-intersection[26922.5243000389 27027.610899987]"
[2] "Ring Self-intersection[38991.2589000446 31986.5599999869]"
[3] "Ring Self-intersection[14484.6860000313 31330.1319999856]"
We can see that there are 3 invalid geometries. Let’s fix them using st_make_valid().
Simple feature collection with 0 features and 6 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] SUBZONE_N SUBZONE_C PLN_AREA_N PLN_AREA_C REGION_N REGION_C geometry
<0 rows> (or 0-length row.names)
Using the code above, we can see that there are no missing values.
As the dataset contains data from Malaysia and Brunei as well, we will use st_intersection() to limit the data to only Singapore.
Now, we can see that in points_within_sg it only contain Singapore road data, combined with the other values from mpsz2019 like “PLN_AREA_N” used above.
[1] "osm_id" "code" "fclass" "name" "ref"
[6] "oneway" "maxspeed" "layer" "bridge" "tunnel"
[11] "SUBZONE_N" "SUBZONE_C" "PLN_AREA_N" "PLN_AREA_C" "REGION_N"
[16] "REGION_C" "geometry"
Simple feature collection with 6 features and 16 fields
Geometry type: LINESTRING
Dimension: XY
Bounding box: xmin: 31466.72 ymin: 30680.54 xmax: 32815.21 ymax: 30873.74
Projected CRS: SVY21 / Singapore TM
osm_id code fclass name ref oneway maxspeed layer
4052 23946437 5122 residential Rhu Cross <NA> F 50 0
9668 32605139 5131 motorway_link <NA> <NA> F 40 0
20076 46337834 5131 motorway_link <NA> <NA> F 50 -2
21690 49961799 5111 motorway East Coast Parkway ECP F 70 1
26543 74722808 5111 motorway East Coast Parkway ECP F 70 1
29808 99007260 5131 motorway_link <NA> <NA> F 50 1
bridge tunnel SUBZONE_N SUBZONE_C PLN_AREA_N PLN_AREA_C REGION_N
4052 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
9668 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
20076 F T MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
21690 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
26543 T F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
29808 T F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
REGION_C geometry
4052 CR LINESTRING (31889.45 30760....
9668 CR LINESTRING (32768.57 30857....
20076 CR LINESTRING (32815.21 30873....
21690 CR LINESTRING (32365.45 30845....
26543 CR LINESTRING (31611.63 30720....
29808 CR LINESTRING (31611.63 30720....
Again, using the st_is_valid() function to test for invalid geometries.
[1] 0
character(0)
No invalid geometries!
Simple feature collection with 232766 features and 16 fields
Geometry type: GEOMETRY
Dimension: XY
Bounding box: xmin: 2679.373 ymin: 23099.51 xmax: 50957.8 ymax: 50220.06
Projected CRS: SVY21 / Singapore TM
First 10 features:
osm_id code fclass name ref oneway maxspeed layer
4052 23946437 5122 residential Rhu Cross <NA> F 50 0
9668 32605139 5131 motorway_link <NA> <NA> F 40 0
20076 46337834 5131 motorway_link <NA> <NA> F 50 -2
29808 99007260 5131 motorway_link <NA> <NA> F 50 1
45723 140562813 5131 motorway_link <NA> <NA> F 70 -1
45728 140562819 5131 motorway_link <NA> <NA> F 50 0
45731 140562823 5131 motorway_link <NA> <NA> F 60 -2
45733 140562826 5131 motorway_link <NA> <NA> F 40 0
52966 150819034 5141 service Bay East Drive <NA> B 0 0
84664 174717984 5153 footway <NA> <NA> B 0 0
bridge tunnel SUBZONE_N SUBZONE_C PLN_AREA_N PLN_AREA_C REGION_N
4052 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
9668 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
20076 F T MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
29808 T F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
45723 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
45728 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
45731 F T MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
45733 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
52966 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
84664 F F MARINA EAST MESZ01 MARINA EAST ME CENTRAL REGION
REGION_C geometry
4052 CR LINESTRING (31889.45 30760....
9668 CR LINESTRING (32768.57 30857....
20076 CR LINESTRING (32815.21 30873....
29808 CR LINESTRING (31611.63 30720....
45723 CR LINESTRING (32782.42 30754....
45728 CR LINESTRING (32645.37 30683....
45731 CR LINESTRING (32809.68 30108....
45733 CR LINESTRING (32609.11 30700....
52966 CR LINESTRING (32173.46 30036....
84664 CR LINESTRING (31750.06 30644....
By using the code above, we can see that majority of the missing values are in the ‘name’ and ‘ref’ column. Therefore, let’s drop the irrelevant columns first before we try it again!
We only kept “osm_id”, “code”, “fclass”, and “PLN_AREA_N” columns.
Simple feature collection with 0 features and 4 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
[1] osm_id code fclass PLN_AREA_N geometry
<0 rows> (or 0-length row.names)
No more missing values here.
Our map so far:
The Grab-Posisi Dataset is an Aspatial dataset, different from the two we prepared above. As such, the pre-processing is slightly different too.
The code below is a chain of dplyr pipes to group the trips by their id and extract the first pingtimestamp row of each trip in order to get the origin of it.
We will need the files in SF format first before we can use it for further geospatial analysis.
Simple feature collection with 0 features and 10 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
# A tibble: 0 × 11
# Groups: trj_id [0]
# ℹ 11 variables: trj_id <chr>, driving_mode <chr>, osname <chr>,
# pingtimestamp <dttm>, speed <dbl>, bearing <int>, accuracy <dbl>,
# weekday <ord>, start_hr <fct>, day <fct>, geometry <GEOMETRY [m]>
Simple feature collection with 0 features and 10 fields
Bounding box: xmin: NA ymin: NA xmax: NA ymax: NA
Projected CRS: SVY21 / Singapore TM
# A tibble: 0 × 11
# Groups: trj_id [0]
# ℹ 11 variables: trj_id <chr>, driving_mode <chr>, osname <chr>,
# pingtimestamp <dttm>, speed <dbl>, bearing <int>, accuracy <dbl>,
# weekday <ord>, end_hr <fct>, day <fct>, geometry <GEOMETRY [m]>
No missing values, we are almost ready.
To verify that the points that we removed is indeed from the islands, here’s a chunk of code to prove:
# Finding out points removed
diff_id <- origin_sf$trj_id[!(origin_sf$trj_id %in% origin_sf_new$trj_id)]
# Extracting full information of these points
outliers <- origin_sf[(origin_sf$trj_id %in% diff_id), ]
# Checking where these places are from
unique(st_intersection(outliers, mpsz2019)$PLN_AREA_N)[1] "WESTERN ISLANDS" "SOUTHERN ISLANDS"
They are indeed from “WESTERN ISLANDS” and “SOUTHERN ISLANDS”.
Now that our grab dataset is almost ready, we need to decide which column we should drop. Here are the columns in both origin_sf_new and dest_sf_new:
[1] "trj_id" "driving_mode" "osname" "pingtimestamp"
[5] "speed" "bearing" "accuracy" "weekday"
[9] "start_hr" "day" "SUBZONE_N" "SUBZONE_C"
[13] "PLN_AREA_N" "PLN_AREA_C" "REGION_N" "REGION_C"
[17] "geometry"
[1] "trj_id" "driving_mode" "osname" "pingtimestamp"
[5] "speed" "bearing" "accuracy" "weekday"
[9] "end_hr" "day" "SUBZONE_N" "SUBZONE_C"
[13] "PLN_AREA_N" "PLN_AREA_C" "REGION_N" "REGION_C"
[17] "geometry"
We will definitely be dropping the columns merged from mpsz2019_new (other than PLN_AREA_N), but what about “driving_mode”, “osname”, “speed”, “bearing”, and “accuracy”? Let’s first take a look at them.
Seeing that there is only 1 constant in the column, it is safe for us to drop this column.
There are 2 values, mainly “ios” and “android”. Arguments can be made that we can analyse the behavior of both type in terms of using grab hailing services, but that’s not what we will doing so we will drop this as well.
As we are analysing start/stop points, speed will not be a relevant factor hence we will be dropping them.
Not relevant as well, hence dropping.
According to research paper published on Grab website, this is the definition of the accuracy column:
“…the accuracy level roughly indicates the radius of the circle within which the true location lies with a certain probability. The lower the accuracy level, the more precise the reported GPS ping is.”
With that, let’s take a look at the distribution of accuracy score.


From the plot, we can see that there are 3 clear outliers with accuracy above 180~ for origin_sf_new, and 1 for dest_sf_new. Now let’s extract these trips.
Simple feature collection with 3 features and 16 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 18132.25 ymin: 30203 xmax: 28937.76 ymax: 36948.91
Projected CRS: SVY21 / Singapore TM
# A tibble: 3 × 17
trj_id driving_mode osname pingtimestamp speed bearing accuracy
<chr> <chr> <chr> <dttm> <dbl> <int> <dbl>
1 78815 car ios 2019-04-21 13:20:13 0.000000101 68 200
2 67866 car ios 2019-04-18 16:46:16 -1 13 547
3 4579 car ios 2019-04-21 10:35:22 13.0 108 200
# ℹ 10 more variables: weekday <ord>, start_hr <fct>, day <fct>,
# SUBZONE_N <chr>, SUBZONE_C <chr>, PLN_AREA_N <chr>, PLN_AREA_C <chr>,
# REGION_N <chr>, REGION_C <chr>, geometry <POINT [m]>
Simple feature collection with 1 feature and 16 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 33721.09 ymin: 34502.5 xmax: 33721.09 ymax: 34502.5
Projected CRS: SVY21 / Singapore TM
# A tibble: 1 × 17
trj_id driving_mode osname pingtimestamp speed bearing accuracy weekday
<chr> <chr> <chr> <dttm> <dbl> <int> <dbl> <ord>
1 68340 car ios 2019-04-12 11:55:48 -1 10 1414 Fri
# ℹ 9 more variables: end_hr <fct>, day <fct>, SUBZONE_N <chr>,
# SUBZONE_C <chr>, PLN_AREA_N <chr>, PLN_AREA_C <chr>, REGION_N <chr>,
# REGION_C <chr>, geometry <POINT [m]>
To ensure that our data is of utmost accuracy, we will drop these trips, before we drop the accuracy column as well (as we will not need it anymore).
With that done, we can now drop the columns that we don’t need.
Lastly, let’s check for duplicated points on the map.
[1] FALSE
[1] 0
No duplicated points!
It is important for the data to be in the right coordinate reference system (CRS). In this assignment, all spatial data will be projected in EPSG:3414, which is a projected coordinate system for Singapore.
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
They are all in the correct CRS!
Finally, plotting all three datasets together to ensure that they have a consistent projection system.
Before we begin our Geospatial Analysis, let’s first take a closer look at the Grab dataset.
The distribution of the trips across all 7 days of the week looks even.
First let us look at the top 10 planning areas for grab ride origin points. Tampines is the Planning Area with the most origin points.
origin_pl_area <- origin_sf_new %>%
group_by(PLN_AREA_N) %>%
summarise(total_count=n()) %>%
top_n(10, total_count) %>%
.$PLN_AREA_N
ggplot(origin_sf_new[origin_sf_new$PLN_AREA_N %in% origin_pl_area,],
aes(x=PLN_AREA_N)) + geom_bar() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(title = "Trips Origin Distribution by Planning Area",
x = "Planning Area",
y = "Number of Trips")
Then for the destination points.
dest_pl_area <- dest_sf_new %>%
group_by(PLN_AREA_N) %>%
summarise(total_count=n()) %>%
top_n(10, total_count) %>%
.$PLN_AREA_N
ggplot(dest_sf_new[dest_sf_new$PLN_AREA_N %in% dest_pl_area,],
aes(x=PLN_AREA_N)) + geom_bar() +
theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1)) +
labs(title = "Trips Destination Distribution by Planning Area",
x = "Planning Area",
y = "Number of Trips")
6 out of 10 of the Planning Areas remains the same for destination points, mainly TAMPINES, WOODLANDS, YISHUN, QUEENSTOWN, BUKIT MERAH, and CHANGI. This time however, the Planning Area with the most destination points is Changi.

From the graph, we can see that the starting hour peaks at midnight (12am - 1am) and morning (9am - 10am), the former probably due to the lack of public transport after operating hours, and the latter from rush hour.
In the code chunk below, as.ppp() function is used to derive a ppp object layer directly from a sf tibble data.frame.
Marked planar point pattern: 27872 points
Average intensity 2.636568e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
marks are of type 'character'
Summary:
Length Class Mode
27872 character character
Window: rectangle = [3661.47, 49845.23] x [26795.39, 49685.08] units
(46180 x 22890 units)
Window area = 1057130000 square units
Marked planar point pattern: 27820 points
Average intensity 2.642188e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
marks are of type 'character'
Summary:
Length Class Mode
27820 character character
Window: rectangle = [3638.69, 50024.92] x [26770.54, 49469.41] units
(46390 x 22700 units)
Window area = 1052920000 square units
In the code chunk as.owin() is used to create an owin object class from polygon sf tibble data.frame. In this case, we will be converting the sg_boundary polygon.
We will now combine singapore’s boundary and the origin and destination points into one.
Marked planar point pattern: 27820 points
Average intensity 4.185996e-05 points per square unit
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
marks are of type 'character'
Summary:
Length Class Mode
27820 character character
Window: polygonal boundary
37 separate polygons (29 holes)
vertices area relative.area
polygon 1 12666 6.63014e+08 9.98e-01
polygon 2 285 1.61128e+06 2.42e-03
polygon 3 27 1.50315e+04 2.26e-05
polygon 4 (hole) 41 -4.01660e+04 -6.04e-05
polygon 5 (hole) 317 -5.11280e+04 -7.69e-05
polygon 6 (hole) 3 -4.14099e-04 -6.23e-13
polygon 7 30 2.80002e+04 4.21e-05
polygon 8 (hole) 4 -2.86396e-01 -4.31e-10
polygon 9 (hole) 3 -1.81439e-04 -2.73e-13
polygon 10 (hole) 3 -8.68789e-04 -1.31e-12
polygon 11 (hole) 3 -5.99535e-04 -9.02e-13
polygon 12 (hole) 3 -3.04561e-04 -4.58e-13
polygon 13 (hole) 3 -4.46076e-04 -6.71e-13
polygon 14 (hole) 3 -3.39794e-04 -5.11e-13
polygon 15 (hole) 3 -4.52043e-05 -6.80e-14
polygon 16 (hole) 3 -3.90173e-05 -5.87e-14
polygon 17 (hole) 3 -9.59850e-05 -1.44e-13
polygon 18 (hole) 4 -2.54488e-04 -3.83e-13
polygon 19 (hole) 4 -4.28453e-01 -6.45e-10
polygon 20 (hole) 4 -2.18616e-04 -3.29e-13
polygon 21 (hole) 5 -2.44411e-04 -3.68e-13
polygon 22 (hole) 5 -3.64686e-02 -5.49e-11
polygon 23 71 8.18750e+03 1.23e-05
polygon 24 (hole) 6 -8.37554e-01 -1.26e-09
polygon 25 (hole) 38 -7.79904e+03 -1.17e-05
polygon 26 (hole) 3 -3.41897e-05 -5.14e-14
polygon 27 (hole) 3 -3.65499e-03 -5.50e-12
polygon 28 (hole) 3 -4.95057e-02 -7.45e-11
polygon 29 91 1.49663e+04 2.25e-05
polygon 30 (hole) 5 -2.92235e-04 -4.40e-13
polygon 31 (hole) 3 -7.43616e-06 -1.12e-14
polygon 32 (hole) 270 -1.21455e+03 -1.83e-06
polygon 33 (hole) 19 -4.39650e+00 -6.62e-09
polygon 34 (hole) 35 -1.38385e+02 -2.08e-07
polygon 35 (hole) 23 -1.99656e+01 -3.00e-08
polygon 36 71 5.63061e+03 8.47e-06
polygon 37 10 1.99717e+02 3.01e-07
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 664597000 square units
Fraction of frame area: 0.433
The density values of the output range from 0 to 0.000035 which is way too small to comprehend, and it is computed in “number of points per square meter”. Therefore, we are going to use rescale() to covert the unit of measurement from meter to kilometer.
We will first compute the kernel density by using density() of the spatstat package, with the default method bw.diggle().

Looking at all the different methods, we can see that bw.diggle() is still the best among the automatic bandwidth selection method.
Having tried automatic bandwidth selection method, let’s try computing KDE by using a fixed bandwidth defined by us. In our case, we will define a fixed bandwidth of 800m (or 0.8km).
Fixed bandwidth method, however, is very sensitive to highly skewed distribution of spatial point patterns over geographical units, for example urban versus rural. To overcome this, we can try using adaptive bandwidth instead.
As the KDE layer using fixed bandwidth with gaussian kernel plots a graph that allows for meaningful analysis at a glance, we will be using that for the steps moving forward.



In order for us to map the KDE layer of these points to our map, we first need to convert it into grid object.
We will then convert the gridded kernel density objects into RasterLayer object by using raster() of raster package. As the RasterLayer object does not include CRS information, we will need to manually assign it to them as well.
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.4162063, 0.2250614 (x, y)
extent : 2.667538, 55.94194, 21.44847, 50.25633 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=km +no_defs
source : memory
names : v
values : -1.923671e-14, 596.2208 (min, max)
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.4162063, 0.2250614 (x, y)
extent : 2.667538, 55.94194, 21.44847, 50.25633 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=km +no_defs
source : memory
names : v
values : -7.516604e-15, 520.156 (min, max)
To further explore the map, we will now be overlaying the KDE layer both onto OpenStreetMap of Singapore, and also on the Singapore Planning Area layer and OSM road layer that we have pre-processed.
As you can see from the plot, there are certain planning areas that are hotspots for hailing of Grab ride service, in particular Central Region (Orchard, Newton etc), Woodlands, Punggol, Tampines, and most notably Changi (where the airport lies).
To further confirm our observation, let’s plot the KDE layer over our Planning Area and OSM Road Layers.
The common overlapping Planning Areas include “TAMPINES”, “CHANGI”, “WOODLANDS”, and “NOVENA”, so let’s do a further analysis on these areas.
To do in-depth KDE computation on these 4 planning areas, we will first need to extract their respective boundaries. In the code below, we extracted their boundaries and converted them to sp’s Spatial* class.
Plotting down these boundaries.

Turning the spatial point data frame into generic sp format, then into owin object as done previously.
By using the code below, we will be able to extract grab origin and destination points for these specific areas.
Next up is the rescale() function used previously as well.
origin_cg_ppp.km = rescale(origin_cg_ppp, 1000, "km")
origin_tp_ppp.km = rescale(origin_tp_ppp, 1000, "km")
origin_wl_ppp.km = rescale(origin_wl_ppp, 1000, "km")
origin_nv_ppp.km = rescale(origin_nv_ppp, 1000, "km")
dest_cg_ppp.km = rescale(dest_cg_ppp, 1000, "km")
dest_tp_ppp.km = rescale(dest_tp_ppp, 1000, "km")
dest_wl_ppp.km = rescale(dest_wl_ppp, 1000, "km")
dest_nv_ppp.km = rescale(dest_nv_ppp, 1000, "km")Finally, we plot the four planning areas and the grab hailing origin and destination points
par(mfrow=c(2,4))
plot(origin_cg_ppp.km, main = "CHANGI ORIGIN")
plot(origin_tp_ppp.km, main = "TAMPINES ORIGIN")
plot(origin_wl_ppp.km, main = "WOODLANDS ORIGIN")
plot(origin_nv_ppp.km, main = "NOVENA ORIGIN")
plot(dest_cg_ppp.km, main = "CHANGI DESTINATION")
plot(dest_tp_ppp.km, main = "TAMPINES DESTINATION")
plot(dest_wl_ppp.km, main = "WOODLANDS DESTINATION")
plot(dest_nv_ppp.km, main = "NOVENA DESTINATION")
We will now be computing the KDE of each planning area using the fixed bandwidth method.


The hotspot in Changi area is centered around Changi Airport, indicating a likely surge in use of Grab services due to the constant flow of passengers arriving and departing from Singapore.


The hotspot in Tampines area is mainly concentrated around the stretch from Tampines West to Tampines East, encompassing the bulk of where most residents of Tampines currently live (Tampines West, Tampines, Tampines East).


The rides are concentrated around the lower half of Woodlands area, ranging from Woodlands West to Woodlands South, then Woodlands East. However, one prominent hotspot shared across both the origin and destination map is the Woodlands West region, indicating that this might either be the area with the wealthiest residents in Woodlands, or that there are just more residents concentrated here.


The Novena area’s notable hotspots present an interesting distinction. Origin points predominantly converge around the affluent Moulmein area, revealing a concentration in the wealthier section of town. Conversely, the destination points gravitate towards the Malcolm area, characterized by a cluster of prestigious schools, as illustrated in the figure below.


In this section, we will perform the Clark-Evans test of aggregation for a spatial point pattern by using clarkevans.test() of statspat package, to test whether the distribution of Grab ride hailing origin points are randomly distributed.
Using 95% confidence interval, the test hypotheses are:
Ho = The distribution of Grab ride hailing origin points are randomly distributed.
H1= The distribution of Grab ride hailing origin points are not randomly distributed.
For this section, we will be making use of the ppp object.
Having performed the Clark-Evans Test on all 4 planning area and Singapore as a whole, all of their p-values are <2.2e-16 < 0.05, thus we reject Ho. This means that the distribution of Grab ride hailing origin points are not randomly distributed which we explored in earlier sections.
Furthermore, as their R value ranges from 0.11647 to 0.35838 which is <1, this suggests that the points are clustering.
In this section, we will be using appropriate functions of spNetwork package:
where in this case the network refers to OSM’s Road Map of Singapore.
However, due to limitations in computational power, we will be limiting the area of scope down to the 4 areas identified in the previous section, Changi, Tampines, Woodlands, and Novena, and only the Origin points.
Before we begin, let us first convert our sg_roads_new data from SFC_GEOMETRY to SFC_LINESTRING.
Then, let us narrow down the scope of our data to the 4 areas mentioned.
# Roads
cg_roads <- sg_roads_linestring %>% filter(PLN_AREA_N == "CHANGI")
tp_roads <- sg_roads_linestring %>% filter(PLN_AREA_N == "TAMPINES")
wl_roads <- sg_roads_linestring %>% filter(PLN_AREA_N == "WOODLANDS")
nv_roads <- sg_roads_linestring %>% filter(PLN_AREA_N == "NOVENA")
# Grab Origin Points
cg_origin <- origin_sf_new %>% filter(PLN_AREA_N == "CHANGI")
tp_origin <- origin_sf_new %>% filter(PLN_AREA_N == "TAMPINES")
wl_origin <- origin_sf_new %>% filter(PLN_AREA_N == "WOODLANDS")
nv_origin <- origin_sf_new %>% filter(PLN_AREA_N == "NOVENA")Before we begin our analysis, let us visualise our geospatial data to make sure everything falls into place.
We will now perform NetKDE analysis by using appropriate functions provided in spNetwork package.
Next, we will use lines_center() of spNetwork to generate a SpatialPointsDataFrame (i.e. samples) with line centre points.
We are now ready to compute NetKDE. As the code is fairly long, we will split it into 4 tabs.
# Origin
cg_o_densities <- nkde(cg_roads,
events = cg_origin,
w = rep(1,nrow(cg_origin)),
samples = cg_lines_center,
kernel_name = "quartic", # kernel method
bw = 300,
div= "bw",
method = "simple",
# method used to calculate NKDE. spNetwork supports 3 popular methods, namely simple, discontinuous, and continuous
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
# we aggregate events within a 5m radius (faster calculation)
sparse = TRUE,
verbose = FALSE)tp_o_densities <- nkde(tp_roads,
events = tp_origin,
w = rep(1,nrow(tp_origin)),
samples = tp_lines_center,
kernel_name = "quartic", # kernel method
bw = 300,
div= "bw",
method = "simple",
# method used to calculate NKDE. spNetwork supports 3 popular methods, namely simple, discontinuous, and continuous
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
# we aggregate events within a 5m radius (faster calculation)
sparse = TRUE,
verbose = FALSE)wl_o_densities <- nkde(wl_roads,
events = wl_origin,
w = rep(1,nrow(wl_origin)),
samples = wl_lines_center,
kernel_name = "quartic", # kernel method
bw = 300,
div= "bw",
method = "simple",
# method used to calculate NKDE. spNetwork supports 3 popular methods, namely simple, discontinuous, and continuous
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
# we aggregate events within a 5m radius (faster calculation)
sparse = TRUE,
verbose = FALSE)nv_o_densities <- nkde(nv_roads,
events = nv_origin,
w = rep(1,nrow(nv_origin)),
samples = nv_lines_center,
kernel_name = "quartic", # kernel method
bw = 300,
div= "bw",
method = "simple",
# method used to calculate NKDE. spNetwork supports 3 popular methods, namely simple, discontinuous, and continuous
digits = 1,
tol = 1,
grid_shape = c(1,1),
max_depth = 8,
agg = 5,
# we aggregate events within a 5m radius (faster calculation)
sparse = TRUE,
verbose = FALSE)Before we are able to visualise, we first need to insert the computed values back into lines_center and lixels objects as density field.
cg_lines_center$o_density <- cg_o_densities
cg_lixels$o_density <- cg_o_densities
tp_lines_center$o_density <- tp_o_densities
tp_lixels$o_density <- tp_o_densities
wl_lines_center$o_density <- wl_o_densities
wl_lixels$o_density <- wl_o_densities
nv_lines_center$o_density <- nv_o_densities
nv_lixels$o_density <- nv_o_densitiesSince svy21 projection system is in meter, the computed density values are very small i.e. 0.0000005. We will thus need to rescale the density values from number of events per meter to number of events per kilometer.
cg_lines_center$o_density <- cg_lines_center$o_density*1000
cg_lixels$o_density <- cg_lixels$o_density*1000
tp_lines_center$o_density <- tp_lines_center$o_density*1000
tp_lixels$o_density <- tp_lixels$o_density*1000
wl_lines_center$o_density <- wl_lines_center$o_density*1000
wl_lixels$o_density <- wl_lixels$o_density*1000
nv_lines_center$o_density <- nv_lines_center$o_density*1000
nv_lixels$o_density <- nv_lixels$o_density*1000This tmap plot further reinforces our observation above that the grab ride traffic are from incoming tourists or locals returning home form the airport, as you can see the denser area being the Changi Airport Terminals. However, it is worth highlighting that there some slight traffic along the Changi Village area and infront of the Japanese School as well.
As we have discovered earlier, a huge portion of the grab rides indeed originated from Tampines East, one of the more populated area of Tampines. Particularly along Tampines Avenue 2, there seems to be a higher density, presumably due to it being more convenient to get a ride along the main road.
Surprisingly, the other higher density area in this network density map is the area around Changi General Hospital.
There are 3 main points of to focus on with higher density, mainly:
Along the route to Woodlands Checkpoint, showing that a significant portion of the rides in Woodlands are people coming in from Malaysia.
Around the main hub of Woodlands, along the Woodlands MRT stretch. No surprises here, as the area is perhaps the most dense in terms of human traffic due to concentration of malls, bus interchange, and MRT station.
3 different points around the Sembawang Air Base, which I assume is the entrance. This make sense as well, as military bases in Singapore are generally more inaccessible.
Network KDE indicates that the majority of the traffic is along Moulmein Road, which is the main road next to several of the moderately wealthier estates in Singapore.
We are now going to perform complete spatial randomness (CSR) test by using kfunctions() of spNetwork package. The null hypothesis is defined as:
The CSR test is based on the assumption of the binomial point process which implies the hypothesis that the childcare centres are randomly and independently distributed over the street network.
If this hypothesis is rejected, we may infer that the distribution of Grab ride hailing points are spatially interacting and dependent on each other; as a result, they may form nonrandom patterns.
kfun_cg <- kfunctions(cg_roads,
cg_origin[c("trj_id","PLN_AREA_N", "geometry")],
start = 0,
# A double, the start value for evaluating the k and g functions.
end = 1000,
# A double, the last value for evaluating the k and g functions.
step = 50,
# A double, the jump between two evaluations of the k and g function
width = 50,
# The width of each donut for the g-function
nsim = 50,
# number of Monte Carlo simulations required.
resolution = 50,
verbose = FALSE,
agg = 5,
conf_int = 0.05
# A double indicating the width confidence interval (default = 0.05).
)
kfun_cg$plotk

$plotg

$values
obs_k lower_k upper_k obs_g lower_g upper_g distances
1 0.00000 0.000000 0.000000 14.86145 0.469979 0.699525 0
2 36.75689 2.071635 2.493084 43.40340 2.385692 3.014545 50
3 79.23030 4.732819 5.655247 41.90507 2.872094 3.659082 100
4 121.95096 7.804382 9.407329 42.98638 3.276383 4.141424 150
5 166.35078 11.353488 13.743794 47.23408 3.816849 4.716212 200
6 214.49640 15.327735 18.545993 49.64395 4.207299 5.310006 250
7 264.37654 20.027709 23.930915 49.89859 4.695176 5.779985 300
8 316.81416 25.048015 29.823643 52.19405 5.044847 6.328201 350
9 369.73892 30.447883 36.084309 54.99879 5.465374 6.902251 400
10 423.85570 36.321051 43.188058 52.66643 5.899002 7.353593 450
11 477.36355 42.772513 50.406950 55.36046 6.308642 7.794971 500
12 531.87151 49.481200 58.743679 54.89546 6.821983 8.466264 550
13 587.96637 56.801948 67.484697 55.21653 6.954102 8.981451 600
14 643.18290 64.214957 76.791277 55.12058 7.679091 9.165420 650
15 698.60978 72.137433 86.589055 53.94332 8.022856 10.115896 700
16 749.05825 80.697804 97.232501 53.11297 8.626799 10.735707 750
17 802.93515 89.438453 107.939238 49.17157 8.847488 11.075228 800
18 852.30969 98.593170 118.900062 50.21966 9.101760 11.448333 850
19 900.47377 108.443538 130.051499 47.30420 9.810326 12.081430 900
20 946.48263 118.621802 141.719229 45.53648 9.787630 12.499743 950
21 991.41387 128.456301 153.717255 42.06376 10.233252 13.006442 1000
The blue line represents the empirical network K-function of the Grab ride hailing origin points in Changi planning area. The gray envelop represents the results of the 50 simulations in the interval 2.5% - 97.5%. Because the blue line is above the gray area, we can infer that these origin points in Changi planning area are in clusters, which reinforces our observations made above.
kfun_tp <- kfunctions(tp_roads,
tp_origin[c("trj_id","PLN_AREA_N", "geometry")],
start = 0,
# A double, the start value for evaluating the k and g functions.
end = 1000,
# A double, the last value for evaluating the k and g functions.
step = 50,
# A double, the jump between two evaluations of the k and g function
width = 50,
# The width of each donut for the g-function
nsim = 50,
# number of Monte Carlo simulations required.
resolution = 50,
verbose = FALSE,
agg = 10,
conf_int = 0.05
# A double indicating the width confidence interval (default = 0.05).
)
kfun_tp$plotk

$plotg

$values
obs_k lower_k upper_k obs_g lower_g upper_g distances
1 0.00000 0.000000 0.000000 3.544369 0.4598842 0.6893691 0
2 10.43806 1.567842 2.007917 14.896712 1.8537080 2.3743920 50
3 26.84334 3.751302 4.504975 17.212894 2.5004106 2.9548091 100
4 44.52556 6.603559 7.823364 17.877273 3.0124089 3.8264252 150
5 63.16474 10.047357 11.832036 19.330982 3.5387309 4.5878702 200
6 82.52925 13.931382 16.675449 18.724508 4.3155664 5.2296967 250
7 102.07051 18.658832 22.324191 20.833454 5.1176970 6.3144930 300
8 122.50168 24.357404 28.872893 20.988882 5.9496942 7.0899570 350
9 144.84065 31.030753 36.202545 22.921066 6.7297297 8.1939533 400
10 167.65810 38.492061 44.761906 22.872304 7.8413450 9.3712446 450
11 190.66450 46.944756 54.900843 23.615920 8.6893421 10.5224788 500
12 215.64575 56.440495 65.928768 25.459724 9.7416814 11.6025513 550
13 241.81557 67.025298 77.654748 26.828100 10.7675065 12.6644906 600
14 268.85090 78.227241 90.313295 27.711907 11.5769514 14.0491717 650
15 297.63862 90.100725 104.801932 29.711138 12.4683770 15.4370527 700
16 329.33375 103.370318 119.924472 31.756083 13.9028864 16.5669537 750
17 360.65707 117.417814 137.285632 32.469224 14.7460074 17.8314065 800
18 393.75715 132.962447 155.632694 35.346167 15.6151854 18.6562418 850
19 429.61227 149.358125 174.804896 35.693594 16.8448956 20.0186753 900
20 466.66205 166.587781 195.704483 39.131297 17.7242831 21.4947847 950
21 506.48820 184.965776 217.092144 39.405581 19.0678214 22.4637719 1000
Similar to Changi planning area, as the blue line is above the grey area, we can infer that the Tampines planning area consists of mainly origin points in clusters.
kfun_wl <- kfunctions(wl_roads,
wl_origin[c("trj_id","PLN_AREA_N", "geometry")],
start = 0,
# A double, the start value for evaluating the k and g functions.
end = 1000,
# A double, the last value for evaluating the k and g functions.
step = 50,
# A double, the jump between two evaluations of the k and g function
width = 50,
# The width of each donut for the g-function
nsim = 50,
# number of Monte Carlo simulations required.
resolution = 50,
verbose = FALSE,
agg = 5,
conf_int = 0.05
# A double indicating the width confidence interval (default = 0.05).
)
kfun_wl$plotk

$plotg

$values
obs_k lower_k upper_k obs_g lower_g upper_g distances
1 0.00000 0.000000 0.000000 12.80481 0.8495718 1.211788 0
2 30.08326 2.693564 3.314096 35.47919 3.3915903 4.182993 50
3 67.77057 6.513384 7.937374 39.66965 4.2430755 5.432093 100
4 108.36633 11.328773 13.428019 41.09708 5.5371418 6.627234 150
5 149.89203 17.488168 20.821972 42.13034 6.5418945 8.007406 200
6 193.36561 24.861648 28.989725 44.11650 7.5577451 9.403076 250
7 237.89542 33.423955 39.010272 45.81948 9.0860177 11.267925 300
8 285.19973 43.385950 50.619748 47.25456 10.5364128 12.781464 350
9 332.27061 55.214898 64.442320 48.52127 12.1528955 14.807425 400
10 383.29084 68.489458 80.236128 52.89924 14.1765603 17.076394 450
11 437.26544 84.123876 97.770602 54.76294 16.4191238 19.332352 500
12 492.18911 101.740054 117.890147 55.70436 18.0949233 22.019792 550
13 547.99297 121.325938 141.371049 57.94692 20.1658504 24.419450 600
14 608.33170 142.489844 166.829311 60.72525 22.4774894 26.949414 650
15 669.64247 166.760427 195.354834 62.01109 24.7108683 29.310420 700
16 732.69448 191.857238 226.072787 64.49475 26.4677982 31.845932 750
17 799.31698 219.352750 259.319748 68.91482 28.0479253 34.129635 800
18 871.09049 248.581848 294.996790 73.95102 30.5499526 36.948721 850
19 945.99058 280.807792 332.762171 76.33518 33.0565722 39.630995 900
20 1024.72905 314.979523 373.985157 81.14559 34.6833874 42.052466 950
21 1109.01653 350.745349 417.850426 86.31191 37.6017815 44.339613 1000
Similar to Changi planning area, as the blue line is above the grey area, we can infer that the Woodlands planning area consists of mainly origin points in clusters.
kfun_nv <- kfunctions(nv_roads,
nv_origin[c("trj_id","PLN_AREA_N", "geometry")],
start = 0,
# A double, the start value for evaluating the k and g functions.
end = 1000,
# A double, the last value for evaluating the k and g functions.
step = 50,
# A double, the jump between two evaluations of the k and g function
width = 50,
# The width of each donut for the g-function
nsim = 50,
# number of Monte Carlo simulations required.
resolution = 50,
verbose = FALSE,
agg = 5,
conf_int = 0.05
# A double indicating the width confidence interval (default = 0.05).
)
kfun_nv$plotk

$plotg

$values
obs_k lower_k upper_k obs_g lower_g upper_g distances
1 0.000000 0.0000000 0.0000000 2.070894 0.08472941 0.1845110 0
2 4.899254 0.3204665 0.4990295 5.799959 0.41709206 0.6252739 50
3 10.932279 0.7748250 1.2084259 5.931059 0.54867756 0.8215597 100
4 16.831777 1.3828617 2.0543847 5.841231 0.66484668 1.0406665 150
5 22.590463 2.3091799 3.1961682 6.110714 0.83734022 1.2532184 200
6 28.919677 3.2960712 4.5303529 6.732225 1.05049903 1.5240369 250
7 36.047630 4.6371751 6.1328070 7.460558 1.28793560 1.8402548 300
8 43.869927 6.1596339 8.0875315 8.091780 1.54952855 2.1862130 350
9 52.602640 7.8951302 10.4276656 8.999769 1.74872764 2.5032807 400
10 61.765070 9.8237561 13.1721459 9.453763 1.90179896 2.6397217 450
11 70.978483 12.0261138 16.0894839 9.152719 2.18669857 3.0627618 500
12 80.264729 14.4184451 19.2424376 9.232835 2.44562097 3.4005870 550
13 89.281492 17.0621726 22.7227771 9.490180 2.72894252 3.7030880 600
14 99.725788 19.8723243 26.7797135 10.415163 2.89985801 4.0240401 650
15 110.301184 23.0399661 31.0014960 11.077946 3.11240987 4.3909985 700
16 121.383985 26.2260590 35.6282316 11.230896 3.43396891 4.6671581 750
17 132.750836 29.7339537 40.3180895 11.612056 3.67577548 5.0433422 800
18 144.807176 33.6640388 45.4670399 11.956801 3.79073071 5.3430512 850
19 156.674149 37.7068941 50.8213803 12.357384 3.99672757 5.6451880 900
20 169.582638 41.8247678 56.4447183 13.102711 4.10658447 5.6922869 950
21 182.794599 46.0130469 62.3314701 13.190111 4.40617213 5.8928213 1000
Similar to Changi planning area, as the blue line is above the grey area, we can infer that the Novena planning area consists of mainly origin points in clusters.
The results of our G- and K-Function Analysis on all four planning area shows a spatial pattern of clustering among the grab origin points, which supports the idea that grab rides are commonly booked at the same location within an area, possibly due to designated pickup points or taxi stands.
In conclusion, our analysis of Grab ride-hailing origin points in the specific planning areas of Changi, Tampines, Woodlands, and Novena, and also the whole of Singapore uncovered noteworthy spatial patterns. The observed clustering of origin points within these areas suggests a localized preference for specific pickup locations, potentially driven by factors such as designated pickup points, popular landmarks, transportation hubs, or simply area with higher population density.
These findings hold practical implications for both Grab and urban planners as the identified clusters can guide Grab in optimizing their service by strategically placing vehicles or promoting the use of specific pickup points, ultimately enhancing the efficiency and user experience. Urban planners, on the other hand, can leverage this information to make informed decisions regarding infrastructure development, such as improving the accessibility of popular pickup locations or adjusting traffic flow in areas with high ride-hailing activity.
Moreover, understanding the spatial dynamics of Grab ride-hailing services contributes to a broader perspective on urban mobility patterns. This knowledge can be valuable for city officials, transportation authorities, and policymakers in crafting policies that support sustainable and efficient transportation solutions. By aligning urban planning efforts with the observed ride-hailing patterns, cities can work towards creating more resilient, user-friendly, and accessible transportation systems.
In essence, our analysis not only sheds light on the localized behaviors of Grab users but also opens avenues for strategic decision-making that can enhance the overall urban mobility landscape. As technology continues to shape the future of transportation, such spatial insights play a crucial role in fostering innovation and creating urban environments that are responsive to the evolving needs of their residents.